Dali usage #1639
@zhreshold: You could start reviewing this PR. The log files you asked about are here
Job PR-1639-2023252 is done.
Job PR-1639-dec5bc8 is done.
Sorry for my delayed response. Can you take another look at the installation and error-catching issues?
@@ -13,18 +12,29 @@
from gluoncv.model_zoo import get_model
from gluoncv.utils import makedirs, LRSequential, LRScheduler

import dali
should we catch the ImportError here?
Definitely, I can catch this exception. It should only happen when train_imagenet.py was updated but, for some reason, dali.py was not downloaded. I will do that.
Done. I added handling for the ImportError exception, which should occur when dali-gpu or dali-cpu is used but dali.py is not in scripts/classification/imagenet.
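For readers following the thread, the guard discussed above could look roughly like the sketch below. It is not the code merged in this PR: the function name load_dali_helper and its module_name parameter are hypothetical, and the only details taken from the discussion are the backend names dali-gpu and dali-cpu, the helper file dali.py, and its expected location.

```python
import importlib


def load_dali_helper(data_backend, module_name="dali"):
    """Import the local helper module only when a DALI backend is requested.

    Hypothetical helper for illustration; the PR may structure this differently.
    """
    if data_backend not in ("dali-gpu", "dali-cpu"):
        # Non-DALI backends never need the helper module.
        return None
    try:
        return importlib.import_module(module_name)
    except ImportError as err:
        # Re-raise with a message that says which file is missing and where.
        raise ImportError(
            "data backend %r needs %s.py next to train_imagenet.py "
            "(scripts/classification/imagenet)" % (data_backend, module_name)
        ) from err
```

Deferring the import until a DALI backend is actually selected means that a missing dali.py cannot break runs that use the regular data pipeline.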
Job PR-1639-a0c1ee2 is done.
Tested all three backends. LGTM except for the import issue.
Job PR-1639-5dccefa is done.
LGTM
Script to run regular GluonCV training with and without the NVIDIA Data Loading Library (DALI).
To launch training use scripts/classification/imagenet/test.sh, which will install DALI.
Before launching this script you can set the following environment variables:
MODEL, NUM_TRAINING_SAMPLES, NUM_EPOCHS, DATA_BACKEND, TRAIN_DATA_DIR
If some (or all) of these variables are not set, the script uses their default values.
To launch this script with DALI, set the environment variable DATA_BACKEND to dali-gpu or dali-cpu.
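As a rough illustration of how these variables and their fallbacks interact, here is a minimal Python sketch. test.sh itself is a shell script and its actual defaults are not reproduced here, so the caller supplies the defaults; read_settings and uses_dali are hypothetical names, and the values in the usage note are placeholders, not the script's real defaults.

```python
import os

# Variable names and DALI backend names as listed in the PR description.
VARIABLES = ("MODEL", "NUM_TRAINING_SAMPLES", "NUM_EPOCHS",
             "DATA_BACKEND", "TRAIN_DATA_DIR")
DALI_BACKENDS = ("dali-gpu", "dali-cpu")


def read_settings(defaults, env=None):
    """Return each variable from the environment, or its default if unset."""
    env = os.environ if env is None else env
    return {name: env.get(name, defaults[name]) for name in VARIABLES}


def uses_dali(settings):
    """True when the selected data backend is one of the DALI backends."""
    return settings["DATA_BACKEND"] in DALI_BACKENDS
```

For example, with placeholder defaults such as {"MODEL": "some-model", "NUM_TRAINING_SAMPLES": "1000", "NUM_EPOCHS": "3", "DATA_BACKEND": "mxnet", "TRAIN_DATA_DIR": "/path/to/train"}, calling read_settings(defaults, env={"DATA_BACKEND": "dali-gpu"}) overrides only the backend, and uses_dali then returns True for the result.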
The following charts show the performance improvement when DALI is used:
and the accuracy achieved over 3 epochs: